Genetic risk prediction is an important component of individualized medicine, but prediction accuracies remain low for many complex diseases. A fundamental limitation is the sample sizes of the studies on which the prediction algorithms are trained. One way to increase the effective sample size is to integrate information from previously existing studies. However, it can be difficult to find existing data that examine the target disease of interest, especially if that disease is rare or poorly studied. Furthermore, individual-level genotype data from these auxiliary studies are typically difficult to obtain. This paper proposes a new approach to integrative genetic risk prediction of complex diseases with binary phenotypes. It accommodates possible heterogeneity in the genetic etiologies of the target and auxiliary diseases using a tuning parameter-free nonparametric empirical Bayes procedure, and can be trained using only auxiliary summary statistics. Simulation studies show that the proposed method can provide superior predictive accuracy relative to non-integrative as well as integrative classifiers. The method is applied to a recent study of pediatric autoimmune diseases, where it substantially reduces prediction error for certain target/auxiliary disease combinations. The proposed method is implemented in the R package ssa.